AMENDMENTS

In the Claims 

The following is a marked-up version of the claims, in which underlined language is being
added and struck-through language is being deleted:

1. (Currently Amended) A method comprising:

(A) receiving an email message from a simple mail transfer protocol (SMTP) server, the
email message comprising displaying characters and non-displaying characters, the
non-displaying characters including non-displaying comments and non-displaying control characters;
the email message further comprising:

(A1) a 32-bit string indicative of the length of the email message;
(A2) a text body; 

(A3) an SMTP email address; that includes a user name and a domain name; 
(A4) a domain name corresponding to the SMTP email address;
(A5)(A4) an attachment;

(B) searching for the non-displaying characters in the email; 

(C) removing the searched non-displaying characters; characters, including the
non-displaying comments and the non-displaying control characters;

(D) determining non-alphabetic displaying characters in the email; email, where determining
non-alphabetic displaying characters includes a per-character analysis that recursively
determines for each character whether:

(D1) a character is a non-alphabetic character;

(D2) if the character is a non-alphabetic character, whether the character is a space; 
(D3) if the character is a space, determine whether the space is adjacent to a solitary
"i" or "a"; and
(E) (D4) if the non-alphabetic character is not a space, filtering the determined
non-alphabetic displaying characters from the email; 

(F) (E) generating a phonetic equivalent for each word that includes only alphabetic displaying 
characters that has a phonetic equivalent; 

(G) (F) tokenizing the phonetic equivalents in the displaying text body to generate tokens 
representative of words in the text; 

(H) (G) tokenizing the SMTP email address to generate a token representative of the SMTP 
email address; 

(I) (H) tokenizing the domain name to generate a token that is representative of the domain name;

(J) (I) tokenizing the attachment to generate a token that is representative of the attachment,
wherein tokenizing comprises:

(J1) (I1) generating a 128-bit MD5 hash of the attachment;

(I2) appending the 32-bit string to the generated MD5 hash to produce a 160-bit
number; and

(I3) UUencoding the 160-bit number to generate the token representative of
the attachment;

(K) (J) determining a spam probability value for each of the generated tokens; 

(L)(K) sorting the generated tokens in accordance with the corresponding determined spam
probability value to determine a predefined number of interesting tokens, the predefined
number of interesting tokens being a subset of the generated tokens;

(L) classifying the generated tokens as spam, non-spam, or neutral;

(M) selecting the predefined number of interesting tokens, the interesting tokens being the
generated tokens having the greatest non-neutral probability values;

(N) performing a Bayesian analysis on the selected interesting tokens to generate a spam
probability; and

(O) categorizing the email message as a function of the generated spam probability.
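
For illustration only, the attachment tokenization recited in claim 1 (an MD5 hash, the appended 32-bit length string, and UUencoding) might be sketched in Python as follows. The function name, the big-endian byte order of the 32-bit string, and passing the message length as a parameter are assumptions made for this sketch; the claim does not specify them.

```python
import binascii
import hashlib
import struct


def attachment_token(attachment: bytes, message_length: int) -> str:
    """Hypothetical helper: MD5 hash + 32-bit length, UUencoded."""
    # 128-bit MD5 hash of the attachment (16 bytes).
    digest = hashlib.md5(attachment).digest()
    # 32-bit string indicative of the message length.
    # Big-endian unsigned encoding is an assumption, not taken from the claim.
    length_field = struct.pack(">I", message_length)
    # Appending the 32-bit string yields a 160-bit number (20 bytes).
    combined = digest + length_field
    # b2a_uu emits a single uuencoded line ending in a newline; drop the newline.
    return binascii.b2a_uu(combined).decode("ascii").rstrip("\n")


token = attachment_token(b"example attachment bytes", 2048)
```

Because the token depends only on the attachment bytes and the length field, two messages carrying the same attachment with the same recorded length would produce the same token under this sketch.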
2.-5. (Canceled) 

6. (Currently Amended) A method comprising: 

receiving receiving, at a computing device, an email message comprising a text body, an
SMTP email address, an attachment, and a domain name corresponding to the SMTP email 
address, the text body including displaying characters and non-displaying characters; 

searching for the non-displaying characters in the email; 

removing the searched non-displaying characters; characters, including non-displaying
comments and non-displaying control characters;

tokenizing the SMTP email address to generate a token representative of the displaying 
characters of the SMTP email address; 

tokenizing the attachment to generate a token that is representative of the attachment; 

tokenizing the domain name to generate a token representative of the domain name; 

determining a spam probability value from the generated tokens; and 

sorting the generated tokens in accordance with the corresponding determined spam 
probability value to determine a predefined number of interesting tokens, the predefined number 
of interesting tokens being a subset of the generated tokens. 
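
As a non-authoritative sketch of the searching and removing steps of claim 6, the Python below strips non-displaying comments and non-displaying control characters from a text body. It assumes that the comments are HTML comments of the form `<!-- ... -->` and that control characters are those in Unicode categories Cc and Cf other than newline and tab; neither assumption comes from the claims, and the function name is hypothetical.

```python
import re
import unicodedata

# Assumption: non-displaying comments are HTML comments.
_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def strip_non_displaying(text: str) -> str:
    """Hypothetical helper: remove comments, then control/format characters."""
    without_comments = _COMMENT.sub("", text)
    return "".join(
        ch
        for ch in without_comments
        # Keep newline and tab as ordinary whitespace (an assumption).
        if ch in "\n\t" or unicodedata.category(ch) not in ("Cc", "Cf")
    )


cleaned = strip_non_displaying("V<!-- filler -->iag\x00ra\u200b")
```

In this example a word broken up by a comment, a NUL byte, and a zero-width space is restored to a single run of displaying characters before tokenization.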



7.-10. (Canceled) 

11. (Previously Presented) The method of claim 6, wherein determining the spam
probability comprises: 

assigning a spam probability value to the token representative of the SMTP email 
address; 

assigning a spam probability value to the token representative of the domain name; and 
generating a Bayesian probability value using the spam probability values assigned to 
the tokens. 
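
One way the Bayesian probability value of claim 11 could be generated from the per-token spam probability values is sketched below. The combining formula, which treats the tokens as independent evidence, is a commonly used naive Bayes combination and is an assumption here; the claims do not recite a particular formula.

```python
from math import prod


def combined_spam_probability(token_probabilities):
    """Hypothetical helper: combine per-token values into one probability."""
    values = list(token_probabilities)
    p_spam = prod(values)                      # evidence that the message is spam
    p_ham = prod(1.0 - p for p in values)      # evidence that it is not
    if p_spam + p_ham == 0.0:
        return 0.5                             # contradictory evidence: neutral
    return p_spam / (p_spam + p_ham)


# Tokens for the SMTP email address and the domain name, for example.
value = combined_spam_probability([0.9, 0.8])
```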

12. (Previously Presented) The method of claim 11, wherein determining the spam
probability further comprises: 

comparing the generated Bayesian probability value with a predefined threshold value. 

13. (Previously Presented) The method of claim 12, wherein determining the spam 
probability further comprises: 

categorizing the email message as spam in response to the Bayesian probability value 
being greater than the predefined threshold. 

14. (Previously Presented) The method of claim 12, wherein determining the spam 
probability further comprises: 

categorizing the email message as non-spam in response to the Bayesian probability 
value being not greater than the predefined threshold. 
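
The threshold comparison of claims 12 through 14 reduces to a single test, sketched here in Python. The default threshold of 0.9 and the function name are assumptions for illustration; the claims recite only a predefined threshold value.

```python
def categorize(bayesian_probability: float, threshold: float = 0.9) -> str:
    """Hypothetical helper: spam only when strictly greater than the threshold."""
    # "Greater than" maps to spam; "not greater than" (including equal) maps to non-spam.
    return "spam" if bayesian_probability > threshold else "non-spam"
```

A value exactly equal to the threshold is categorized as non-spam, matching the "not greater than" wording of claim 14.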



15. (Canceled) 

16. (Previously Presented) The method of claim 6, wherein receiving the email
message further comprises: 

receiving an email message including a text body. 

17. (Previously Presented) The method of claim 16, further comprising:
tokenizing the words in the text body to generate tokens representative of the words in
the text body.
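
A minimal sketch of tokenizing the words of a text body, as in claim 17, follows. Treating a word as a run of ASCII letters and lowercasing each token are assumptions of this sketch, not requirements of the claim.

```python
import re


def tokenize_words(text_body: str) -> list:
    """Hypothetical helper: one lowercase token per run of letters."""
    return [word.lower() for word in re.findall(r"[A-Za-z]+", text_body)]


tokens = tokenize_words("Buy NOW, save 50%!")
```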

18. (Canceled) 

19. (Previously Presented) The method of claim 17, wherein determining the spam 
probability comprises: 

assigning a spam probability value to each of the tokens representative of the words in 
the text body; 

assigning a spam probability value to the token representative of the attachment; and 
generating a Bayesian probability value using the spam probability values assigned to 
the tokens. 

20. (Previously Presented) The method of claim 19, wherein determining the spam 
probability further comprises: 

comparing the generated Bayesian probability value with a predefined threshold value. 

21. (Previously Presented) The method of claim 20, wherein determining the spam
probability further comprises: 

categorizing the email message as spam in response to the Bayesian probability value 
being greater than the predefined threshold. 



22. (Previously Presented) The method of claim 20, wherein determining the spam 
probability further comprises: 

categorizing the email message as non-spam in response to the Bayesian probability 
value being not greater than the predefined threshold. 

23. (Currently Amended) A system comprising:

a memory component that stores at least the following: 

email receive logic configured to receive an email message comprising an SMTP 
email address, a domain name corresponding to the SMTP email address, and an address, 
attachment, the email message further including displaying characters and non-displaying 
characters; 

searching logic configured to search for the non-displaying characters in the
email;

removing logic configured to remove the searched non-displaying characters; 
characters, including non-displaying comments and the non-displaying control characters; 

tokenize logic configured to tokenize the SMTP email address to generate a 
token representative of the SMTP email address; 

tokenize logic configured to tokenize the attachment to generate a token that is 
representative of the attachment; 

tokenize logic configured to tokenize the domain name to generate a token 
representative of the domain name; 

analysis logic configured to determine a spam probability value from the 
generated tokens; and 

sorting logic configured to sort the generated tokens in accordance with the 
corresponding determined spam probability value to determine a predefined number of 
interesting tokens, the predefined number of interesting tokens being a subset of the generated 
tokens, wherein only displaying characters are tokenized. 
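
The sorting logic of claim 23 might be sketched as below. Ranking tokens by their distance from a neutral value of 0.5, and the default of 15 interesting tokens, are assumptions for this sketch; they are loosely based on the "greatest non-neutral probability values" language of claim 1, and the function name is hypothetical.

```python
def interesting_tokens(token_probabilities: dict, count: int = 15) -> list:
    """Hypothetical helper: the `count` tokens farthest from neutral (0.5)."""
    ranked = sorted(
        token_probabilities,
        key=lambda token: abs(token_probabilities[token] - 0.5),
        reverse=True,
    )
    # The result is a subset of the generated tokens.
    return ranked[:count]


selected = interesting_tokens(
    {"free": 0.99, "the": 0.5, "meeting": 0.05, "click": 0.9}, count=2
)
```

Under this ranking, strongly non-spam tokens count as interesting alongside strongly spam tokens, while neutral tokens are dropped first.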

24. (Currently Amended) A system comprising:

means for receiving an email message comprising an SMTP email address, a domain 
name corresponding to the SMTP email address, and an address, attachment, the email
message further including displaying characters and non-displaying characters; 

means for searching for the non-displaying characters in the email; 

means for removing the searched non-displaying characters; characters, including the
non-displaying comments and the non-displaying control characters;

means for tokenizing the SMTP email address to generate a token representative of the 
SMTP email address; 

means for tokenizing the attachment to generate a token that is representative of the 
attachment; 

means for tokenizing the domain name to generate a token representative of the domain
name;

means for determining a spam probability value from the generated tokens; and 
means for sorting the generated tokens in accordance with the corresponding 
determined spam probability value to determine a predefined number of interesting tokens, the 
predefined number of interesting tokens being a subset of the generated tokens, wherein only 
displaying characters are tokenized. 

25. (Currently Amended) A computer-readable storage medium that includes a
program that, when executed by a computer, performs at least the following: 

receive an email message comprising an SMTP email address, a domain name 
corresponding to the SMTP email address, and an address, attachment, the email message
further including displaying characters and non-displaying characters; 

search for the non-displaying characters in the email; 

remove the searched non-displaying characters; characters, including the non-displaying
comments and the non-displaying control characters;

tokenize the SMTP email address to generate a token representative of the SMTP email 
address; 

tokenize the attachment to generate a token that is representative of the attachment; 

tokenize the domain name to generate a token representative of the domain name; 

determine a spam probability value from the generated tokens; and 

sort the generated tokens in accordance with the corresponding determined spam 
probability value to determine a predefined number of interesting tokens, the predefined number 
of interesting tokens being a subset of the generated tokens, wherein only displaying characters 
are tokenized. 

26. (Currently Amended) The computer-readable storage medium of claim 25, the 
program further causing the computer to perform at least the following: 

assign a spam probability value to the token representative of the SMTP email address; 
assign a spam probability value to the token representative of the domain name; and 
generate a Bayesian probability value using the spam probability values assigned to the
tokens.

27. (Currently Amended) The computer-readable storage medium of claim 26, the
program further causing the computer to perform at least the following: 

compare the generated Bayesian probability value with a predefined threshold value. 

28. (Currently Amended) The computer-readable storage medium of claim 27, the 
program further causing the computer to perform at least the following: 

categorize the email message as spam in response to the Bayesian probability value 
being greater than the predefined threshold. 

29. (Currently Amended) The computer-readable storage medium of claim 27, the 
program further causing the computer to perform at least the following: 

categorize the email message as non-spam in response to the Bayesian probability 
value being not greater than the predefined threshold. 

30. (Currently Amended) A system comprising:
a memory component that stores at least the following: 

email receive logic configured to receive an email message comprising an 
attachment and an address, the email message further including displaying characters and non- 
displaying characters; 

search logic configured to search for the non-displaying characters in the email; 

remove logic configured to remove the searched non-displaying characters;
characters, including the non-displaying comments and the non-displaying control characters;

tokenize logic configured to tokenize the attachment to generate a token
representative of the attachment;

analysis logic configured to determine a spam probability value from the 
generated token; and 

sort logic configured to sort the generated tokens in accordance with the 
corresponding spam probability value to determine a predefined number of interesting tokens, 
the predefined number of interesting tokens being a subset of the generated tokens, wherein 
only displaying characters are tokenized. 

31. (Currently Amended) A system comprising:

means for receiving an email message comprising an attachment and an address, the 
email message further including displaying characters and non-displaying characters; 

means for searching for the non-displaying characters in the email; 

means for removing the searched non-displaying characters; characters, including the
non-displaying comments and the non-displaying control characters;

means for tokenizing the attachment to generate generating a token representative of
the attachment;

means for determining a spam probability value from the generated token; and 
means for sorting the generated tokens in accordance with the corresponding 
determined spam probability value to determine a predefined number of interesting tokens, the 
predefined number of interesting tokens being a subset of the generated tokens, wherein only 
displaying characters are tokenized. 

32. (Currently Amended) A computer-readable storage medium that includes a
program that, when executed by a computer, performs at least the following: 

receive an email message comprising an attachment and an address, the email 
message further including displaying characters and non-displaying characters; 
search for the non-displaying characters in the email; 

remove the searched non-displaying characters; characters, including the non-displaying
comments and the non-displaying control characters;

tokenize the attachment to generate a token representative of the attachment;

determine a spam probability value from the generated token; and 

sort the generated tokens in accordance with the corresponding determined spam 
probability value to determine a predefined number of interesting tokens, the predefined number 
of interesting tokens being a subset of the generated tokens, wherein only displaying characters 
are tokenized. 

33. (Currently Amended) The computer-readable storage medium of claim 32, the 
program further causing the computer to perform at least the following: 

receive an email message having a text body. 

34. (Currently Amended) The computer-readable storage medium of claim 33, the 
program further causing the computer to perform at least the following: 

tokenize the words in the text body to generate tokens representative of the words in the 
text body. 

35. (Currently Amended) The computer-readable storage medium of claim 34, the
program further causing the computer to perform at least the following:

assign a spam probability value to each of the tokens representative of the words in the
text body;

assign a spam probability value to the token representative of the attachment; and
generate a Bayesian probability value using the spam probability values assigned to the
tokens.

36. (Currently Amended) The computer-readable storage medium of claim 35, the 
program further causing the computer to perform at least the following: 

compare the generated Bayesian probability value with a predefined threshold value. 

37. (Currently Amended) The computer-readable storage medium of claim 36, the 
program further causing the computer to perform at least the following: 

categorize the email message as spam in response to the Bayesian probability value 
being greater than the predefined threshold. 

38. (Currently Amended) The computer-readable storage medium of claim 36, the 
program further causing the computer to perform at least the following: 

categorize the email message as non-spam in response to the Bayesian probability 
value being not greater than the predefined threshold. 

39. (Previously Presented) The method of claim 1, wherein the email is received at a
computing device. 



